perm filename RIPPLE[DIS,DBL]2 blob
sn#219344 filedate 1976-06-12 generic text, type C, neo UTF8
COMMENT ⊗ VALID 00007 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00002 00002 .ASEC(Rippling to Locate Relevant Heuristics)
C00006 00003 .ASSEC(The Domain of Applicability of a Heuristic)
C00008 00004 .ASSECP(Efficiently Locating Specializations)
C00020 00005 .ASSECP(Efficiently Locating Examples)
C00033 00006	.ASSECP(Efficiently Locating Relevant Heuristics)
C00042 00007 .ASSECP(Independence Assumptions)
C00049 ENDMK
C⊗;
.ASEC(Rippling to Locate Relevant Heuristics)
<<Delete or shorten this RIPPLE appendix>>
This appendix
discusses in excruciating
detail how AM is able to zero in on the relevant heuristic rules,
once a task has been selected from the agenda.
Each heuristic H has a domain of applicability of the form "H applies to
all examples of C". The author identified the proper concept C for each heuristic
AM possessed, and tacked that heuristic onto that concept C.
This was all preparation. As AM is running, it will choose a task dealing
with, say, concept X.
How does AM locate the heuristic rules which are relevant to X?
By a process called ⊗4rippling upward from X in the Isa direction⊗*.
.ASSEC(The Domain of Applicability of a Heuristic)
AM contains hundreds of heuristics. In order to save time (and to
make AM behave more rationally), each heuristic
should only be tried in situations where it might apply, where it makes sense.
The key observation is that a heuristic typically applies to
"all examples of a particular concept C".
In general, then, we tack each heuristic to the
concept C whose examples it applies to.
So the whole problem of locating relevant heuristics has been reduced to the
problem of efficiently finding all concepts of which C is an example
(for a given concept C). This relationship will be denoted "ISA"$$
i.e., "C ISA X" iff "C is an example of X"
iff "X ε ISA's(C)". $.
As we shall see shortly, the
task of locating all ISA's of C
is closely related to the efficient location of all generalizations and
specializations of a given concept.
The next subsections will deal with those problems: how AM quickly assembles a list
of all the concepts which are generalizations of (or examples of, or
specializations of, or ISA's of) a given concept.
This process will be known as "⊗4rippling⊗*" away from the given concept,
in the Generalizations (or the Specializations, or the Isa's, or the Exs) direction.
.ASSECP(Efficiently Locating Specializations)
In the last subsection, we saw how nice it would be to be able to tell
quickly what all the generalizations, specializations, isa's$$
"x ISA y" iff x is an example of y. The ISA link points from
a concept x to all concepts y which claim x as an example.$, and
examples of a
given concept are. Below is the scheme AM uses to do this.
The most obvious scheme is to store this information explicitly.
So the Examples facet of C would point to all known examples of C, etc.
But there is one crucial (though simple) idea that argues against this:
One can substitute a modest amount
of processing time (via chasing links around) for the vast amount of storage space
that would be needed to have "everything point to everything".
We'll cover this idea slowly, although many readers will already
grok this, and can just skim this section.
Many other readers won't care about this space-saving idea, and may wish to
skip this and move immediately on to the next chapter.
One of the facets a concept may have is "Specializations".
For example, the Specializations facet of Numbers might be the list
(Primes Odds Evens Squares Maximally-divisibles Perfects Amicables).
The Specializations facet of Primes might point to Odd-primes, and ⊗4its⊗*
Specializations facet might contain the entry "Mersenne-primes".
And yet Mersenne-primes are clearly a specialized kind of Number, just as
Primes are. Should the Specializations facet of Numbers point to
them as well? Can't we easily "deduce" that Mersenne-primes are a specialization
of the concept Numbers?
To deduce that, one must realize that a specialization of a
specialization of X is itself a specialization of X. If X.Spec denotes
all the entries on the Specializations facet of X, then
(X.Spec).Spec
will mean all ⊗4their⊗* specialization entries,
and will be written
X.Spec↑2.
The necessary deductive knowledge can be stated as: X.Spec↑n are
specializations of X (for any n).
The function Specializations(X) will be used to signify the results of this
repeated gathering: X.Spec ∪ (X.Spec).Spec ∪...
In fact, this coincides with our intuitive notion of what is -- and isn't --
a specialized kind of X.
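In modern terms, this repeated gathering is just the transitive closure of the Spec links. Here is a small sketch (in Python rather than AM's LISP; the facet table is a hypothetical stand-in for the Specializations facets of the concepts discussed above):

```python
def ripple(concept, spec):
    """Specializations(X) = X.Spec u (X.Spec).Spec u ... :
    follow Spec links repeatedly, gathering every concept reached."""
    found = set()
    frontier = list(spec.get(concept, ()))
    while frontier:
        c = frontier.pop()
        if c not in found:
            found.add(c)
            frontier.extend(spec.get(c, ()))  # ripple one level deeper
    return found

# Hypothetical Specializations facets, mirroring the Numbers example.
SPEC = {
    "Numbers":     ["Odd-numbers", "Primes"],
    "Primes":      ["Odd-primes", "Even-primes"],
    "Odd-numbers": ["Odd-primes"],
    "Odd-primes":  ["Mersenne-primes"],
}
```

Then ripple("Numbers", SPEC) contains Mersenne-primes, even though no facet of Numbers mentions Mersenne-primes directly.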
Graphically, we may signify a Spec relationship by a
downward slanting arc.
Here is a portion of the network of
Spec links to and from these concepts:
.B0 INDENT 25
     Object
        \
         \
       Number
       /    \
      /      \
     /        \
Odd-numbers  Primes
     \       /    \
      \     /      \
     Odd-primes   Even-primes
          \
           \
     Mersenne-primes
.E
It is clear by inspection$$ Visually. It is slower -- but still
quite fast -- to compute this given a nongeometric representation: a
list of nodes with
link fields. $ that Mersenne-primes are a specialization of Numbers.
Also, we see that Mersenne-primes is ⊗4not⊗* known to be a specialization of
Even-primes, since one has to go up and then downward again to travel
along the graph from one of those concepts to the other.
X is a specialization of Y iff you can leave the node X and move
steadily upward (along the slanting arcs) to reach node Y.
Y is a generalization of X iff you can leave node Y and move steadily
downward to reach X. So Object is a generalization of ⊗4all⊗* the nodes
shown in the little graph above.
The diagram is a graphical presentation of the contents$$
Only a portion of the contents. This is all we need see at the moment, for
illustrative purposes. In later chapters, we'll look into these concepts in
greater detail and completeness. $
of the Specializations and Generalizations
facets of the concepts shown.
Each arc actually stands for two: each Spec arc from A to B
has a twin arc from B to A, labelled "Genl", for Generalizations.
Thus one picture can represent both networks$$ One says that the
nodes are doubly-linked. $. When one
traces upwards (downwards) from a particular node, he can reach precisely
the set of generalizations (specializations) of that node.
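This double-linking is cheap to maintain mechanically; a sketch (a Python stand-in, not AM's actual representation):

```python
class Concept:
    """A node carrying its Spec and Genl facets as plain lists."""
    def __init__(self, name):
        self.name = name
        self.spec = []   # downward arcs: specializations
        self.genl = []   # upward arcs: generalizations

def add_spec(general, special):
    """Install one Spec arc and its Genl twin in a single call,
    so the two networks can never fall out of step."""
    general.spec.append(special)
    special.genl.append(general)

number = Concept("Number")
primes = Concept("Primes")
add_spec(number, primes)
```

Tracing upward from primes now reaches number, and downward from number reaches primes, using a single pair of pointers.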
Let's state definitely what we mean when we say that
a concept A is a ⊗4specialization⊗* of a
concept B: we mean that A could be defined as ⊗4"both B.Defn and P"⊗*,
where ⊗4B.Defn⊗* is
some definition of B, and the ⊗4P⊗* is any predicate.
For example, the definition of Odd-primes can be stated as
"Odd-prime(x) iff both Prime(x) and Odd-number(x)". So Odd-primes is a
specialization of Primes, with P=Odd-number$$ Odd-primes is also a specialization
of Odd-numbers, with P=Prime. $.
Similarly, Mersenne-primes is a specialization of Numbers, since we can write
"Mersenne-prime(x) iff both Number(x) and (both Prime(x) and (both Odd(x)
and x is of the form 2↑n-1))". That last definition is of the form
"both Number(x) and [...]", so Mersenne-primes is a specialization of Numbers.
As we move down the graph, each node "adds" one more conjunct onto the definition
of the node above it. As we move from Primes to Odd-primes, the extra conjunct is
the predicate Odd(x). One can now see an efficient algorithm for finding out what
the unknown item x is. We start at the top of the graph, and run Object.Defn
on it. If it passes that test, we run the "additional" predicate that would make
it a Number. If that succeeds, we run the additional tests for Odd-numbers and
for Primes. In general, when x passes the test at node N, we run the tests of all
the nodes immediately below N. When this process terminates, all the nodes whose
tests x passed can claim x as an example. This is cute, and efficient in some sense,
but it still requires a fair amount of processing to assemble the list of all
concepts of which x is an example. We can do this with much less processing, as
the reader will soon see.
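That top-down testing algorithm can be sketched as follows (Python, with toy predicates standing in for the concepts' Defn facets; for brevity each node runs its full definition rather than only the "additional" conjunct):

```python
def claimants(x, top, children, defn):
    """Run the definition at the top node; whenever x passes at a node N,
    run the definitions of the nodes immediately below N.  Returns all
    nodes that can claim x as an example."""
    passed = set()
    frontier = [top]
    while frontier:
        n = frontier.pop()
        if n not in passed and defn[n](x):
            passed.add(n)
            frontier.extend(children.get(n, ()))
    return passed

def is_prime(x):
    return x > 1 and all(x % d for d in range(2, x))

DEFN = {  # toy stand-ins for each concept's Defn facet
    "Object":          lambda x: True,
    "Number":          lambda x: isinstance(x, int),
    "Odd-numbers":     lambda x: isinstance(x, int) and x % 2 == 1,
    "Primes":          lambda x: isinstance(x, int) and is_prime(x),
    "Odd-primes":      lambda x: isinstance(x, int) and x % 2 == 1
                                 and is_prime(x),
    "Mersenne-primes": lambda x: isinstance(x, int) and is_prime(x)
                                 and ((x + 1) & x) == 0,  # x = 2^n - 1
}
CHILDREN = {
    "Object":      ["Number"],
    "Number":      ["Odd-numbers", "Primes"],
    "Odd-numbers": ["Odd-primes"],
    "Primes":      ["Odd-primes"],
    "Odd-primes":  ["Mersenne-primes"],
}
```

For x = 9, only Object, Number, and Odd-numbers can claim it: the Primes test fails, so the only route to the tests below Primes is through Odd-numbers, and the Odd-primes test fails anyway.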
We formally define the relationship
"Generalization" to be the converse of Specialization.
To find all the generalizations of a concept C, we must follow all of C's Genl
links (all the lines
emanating upward from C) and collect a list of all the nodes they terminate in,
then follow
the Genl links from ⊗4those⊗* nodes, etc. This process will quickly halt$$
Each link points to a higher node,
and there is a well-defined top node, "Anything", and there are no cycles.
The longest path from a node to Anything is only about log↓B(N), when there are N
nodes evenly distributed in a tree with B branches per node. In
the current incarnation of AM, the average length of this path is only about 5. $.
We shall call this process "⊗4rippling in the Genl direction⊗*".
The end result of this rippling will be a list of all known generalizations
of concept C.
Analogously, we can "⊗4ripple in the Spec direction⊗*", tracing downward along the
Spec links, to recover all specializations of a given concept.
This is guaranteed to terminate only because the number of nodes is finite.
Now that we've evolved a fast way to find Specializations(X) and Generalizations(X),
let's see if we can do the same for Examples(X) and ISA's(X).
Even though this is cute and interesting in its own right, let's not forget that
the reason for all this work is to derive a fast way of computing ISA's(C),
since the heuristics relevant to C are precisely those tacked onto concepts in
ISA's(C).
.ASSECP(Efficiently Locating Examples)
.EXISA: PAGE
We're now ready to formally define what we mean when we say that A is an
⊗4example⊗* of concept B.
This relationship holds iff
A satisfies
the definition of B.
For example, "3" satisfies the first entry stored in the Definitions facet
of the Number concept, so it must be an example of a number. Similarly,
"3" is also an example of a Prime, an Odd, an Odd-prime, an Object, etc.
Notice that being an example of something is completely independent of
being a specialization of that thing:
.BN
λλ "Compose" is both a specialization of, and an example of, "Operation".
"Compose" is a valid example of an operation, since it satisfies
the definition of Operation. On the other hand, compositions form a
subclass of all operations; to be a composition, X must be an
operation and also must satisfy some stronger requirements. So
Composition is also a specialization of Operation.
λλ "Compose-Insert-and-Delete" is an example of "Compose", but not a specialization.
λλ "Compose-with-Self" is a specialization of "Compose", but not an example.
λλ "Sets" is not related to "Compose" via Spec, Genl, Exs, or ISA.$$
Of course they both have a common ancestor, Any-concept, but one has to
switch directions (up and then down again) to get from one concept to another
using only those four kinds of links. $
.E
We were able to make the following powerful statement about Specializations:
"The specializations of C are precisely those caught by rippling away from C
in the Spec direction$$
The specializations facet of C is denoted C.Spec.
The entries thereon are called Specs of
C. The larger set Specializations(C) contains not only C.Spec, but all specializations
of concepts in C.Spec. Specializations(C) is obtained by rippling away from C in the
Spec direction; that is, C.Spec ∪ (C.Spec).Spec ∪...
$". Can we make an analogous statement for Examples?
The analogous one would be "The examples of C are precisely those caught by
rippling away from C in the Exs direction$$
The examples facet of C is denoted C.Exs. The entries thereon are called Exs of
C. The larger set Examples(C) refers to all known concepts which satisfy the
definition of C. Thus Examples(C) contains C.Exs plus many more concepts (probably).
$". This is beautifully symmetric, but
utterly false!
If X satisfies the definition of Y, and Y satisfies the definition of Z,
then X may or may not (and usually ⊗4won't⊗*) satisfy the definition of Z.
Number satisfies the definition of Any-concept, and the atom "3"
satisfies the definition
of Number, but "3" certainly does ⊗4not⊗* satisfy the definition of Any-concept.
(Note that the atom "3" is not itself a concept: it has no facets and no definition
of its own, so it cannot satisfy the definition of Any-concept.)
After such a dismal failure, it is often helpful to ask
(we may learn something by asking)
"Why did it work for Spec and not for Exs?"
For Specializations, it worked because if x$$
For the remainder of this subsection, X,Y,Z will represent any three concepts,
and x,y,z will represent their definitions. P, Q, R are variables for predicates.
Recall that a∧b means "both a and b". $
has the form
y∧P, and y has the form z∧Q, then x has the form y∧P=(z∧Q)∧P = z∧(Q∧P).
We're talking now about the definitions x,y,z of three concepts X,Y,Z.
So because "AND" is associative, we can show that if
X is a specialization of Y, and Y of Z, then X is certainly a specialization of Z.
But wait! If X is an example of Y, and Y is a ⊗4specialization⊗* of Z,
that means X satisfies Y's definition, which means that X satisfies z∧P,
(where z is Z's definition and P is some predicate),
so X satisfies z plus some other tests, so X satisfies Z's definition,
so X is an example of Z.
So we ⊗4do⊗* have a slightly analogous statement after all:
"Any example of Specializations(C) is an example of C".
This relation is not quite so symmetric as the first one; a heuristic in the
back of my mind urges me to examine the mirror image of this statement.
Namely, is it true that "Any specialization of an example of C is also an
example of C"? Well, let's see. Suppose X is a ⊗4specialization⊗* of Y, and
Y is an ⊗4example⊗* of Z.
Since Y is an example of Z, Y satisfies z.
And x is of the form y∧P (since X is a specialization of Y):
X is just Y with one more conjunct tacked onto its definition.
Tacking on a conjunct restricts the class X denotes, but it doesn't change what
⊗4kind⊗* of entity X is (a restricted composition is still an operation), so
X satisfies z as well. So X ⊗4is⊗* an example of Z. So the statement is true.
We can put both of the above partial results together, into one necessary and
sufficient characterization of the (known, reachable) examples of a concept:
⊗6"Examples(C) are precisely the members of Specializations(Exs(Specializations(C)))."⊗*
What this means graphically is as follows. To find examples of C, start at node
C. Ripple as far as you want (not at all, if desired) in the Spec direction.
Then go in the Exs direction just once. Then ripple again in the Spec direction
as far as you wish.
Let's assume that a triple vertical line (⊗8α⊗⊗*) signifies an Exs link. Then
here is the graphical interpretation of the relationship just derived:
.B0 INDENT 20
C1
 \
  \
   \        ⊗4any number of Spec links⊗*
   ...
     \
      \
       C2
       α⊗   ⊗4one Exs link⊗*
       α⊗
       C3
        \
         \
          \     ⊗4any number of Spec links⊗*
          ...
            \
             \
              C4
.E
.GROUP
C1 might coincide with C2, and C3 with C4. Using the simple diagram
below, the examples of Operation which are pictured are Composition, Dom=Range-op,
Compose-with-self, and Compose-Intersect&Intersect. The last of these is
plucked in two distinct ways, in fact:
.B0 INDENT 20
          Any-concept
             \   α⊗
              \  α⊗
            Activity
            /      \
           /        \
      Predicate   Operation
         α⊗         \     α⊗
         α⊗          \    α⊗
       Equal      Composition
                    /     \
                   /       \
       Dom⊗1=⊗*Range-op   Compose-with-self
              α⊗              α⊗
              α⊗              α⊗
         Compose-Intersect⊗1&⊗*Intersect
.E
.APART
Here is a list of all the relationships that can be derived from this
little graph, using the two rules:$$ RULE1: Specializations(x) = x.Spec↑n, and
RULE2: Examples(x) = Specializations(Exs(Specializations(x))). $
.BEGIN NOFILL SELECT 1 INDENT 0
Specializations(Any-concept) = (Activity, Predicate, Operation, Composition, Dom=Range-op,
Compose-with-self)
Specializations(Activity) = (Predicate, Operation, Composition, Dom=Range-op, Compose-with-self)
Specializations(Predicate) = NIL
Specializations(Equal) = NIL
Specializations(Operation) = (Composition, Dom=Range-op, Compose-with-self)
Specializations(Domain=Range-op) = NIL
Specializations(Composition) = (Dom=Range-op, Compose-with-self)
Specializations(Compose-with-self) = NIL
Specializations(Compose-Intersect&Intersect) = NIL
Examples(Any-concept) = (Activity, Predicate, Operation, Composition, Dom=Range-op,
Compose-with-self, Equal, Compose-Intersect&Intersect)
Examples(Activity) = (Equal, Composition, Dom=Range-op,
Compose-with-self, Compose-Intersect&Intersect)
Examples(Predicate) = (Equal)
Examples(Equal) = NIL
Examples(Operation) = (Composition, Dom=Range-op,
Compose-with-self, Compose-Intersect&Intersect)
Examples(Domain=Range-op) = (Compose-Intersect&Intersect)
Examples(Composition) = (Compose-Intersect&Intersect)
Examples(Compose-with-self) = (Compose-Intersect&Intersect)
Examples(Compose-Intersect&Intersect) = NIL
.E
So AM can get by with a total of 12 links (LISP pointers) in this little graph, where
the "everything points to everything" scheme would demand a separate link for each of
the 37 relationships listed above.
In toto, the actual LISP implementation has about 400 links, implicitly
storing a total of about 5000 relationships (any of which can be recovered almost
instantaneously, by using the formulae for Specializations and Examples).
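The two rules lend themselves to a direct sketch (a Python stand-in for the LISP; the link tables below are read off the little graph above):

```python
def specializations(c, spec):
    """RULE1: the reflexive-transitive closure of the Spec links
    (ripple as far as you please, possibly not at all)."""
    out, frontier = {c}, [c]
    while frontier:
        for s in spec.get(frontier.pop(), ()):
            if s not in out:
                out.add(s)
                frontier.append(s)
    return out

def examples(c, spec, exs):
    """RULE2: Examples(x) = Specializations(Exs(Specializations(x)))."""
    return {e2
            for s in specializations(c, spec)
            for e in exs.get(s, ())
            for e2 in specializations(e, spec)}

SPEC = {"Any-concept": ["Activity"],
        "Activity":    ["Predicate", "Operation"],
        "Operation":   ["Composition"],
        "Composition": ["Dom=Range-op", "Compose-with-self"]}

EXS = {"Any-concept":       ["Activity"],
       "Predicate":         ["Equal"],
       "Operation":         ["Composition"],
       "Dom=Range-op":      ["Compose-Intersect&Intersect"],
       "Compose-with-self": ["Compose-Intersect&Intersect"]}
```

The handful of stored links suffices to recover every relationship in the list above.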
.ASSECP(Efficiently Locating Relevant Heuristics)
Each concept has facets which contain some heuristics. Some of
these are for filling in, some for checking, some for deciding the
interestingness, etc. For example, one Interest feature of Compose
says that a composition is more interesting if the range of the first
argument equals the domain of the second, and that a composition is
meaningless if those two sets don't even intersect.
One entry on the Interest facet of Operation says that an operation is
interesting if its domain and range coincide.
Suppose we want to judge the interestingness of a concept like
"Compose-Intersect-and-Intersect"$$ The intersection of
3 sets. Look at the graph in the previous section,
to refresh your memory of how this relates to other concepts. $.
How can AM zero in on the relevant heuristics?
The key observation is: "A heuristic tacked onto C will be relevant iff
Compose-Intersect-and-Intersect is an Example of C".
Turning this around, we wish to compute ISAs(Compose-Intersect-and-Intersect).
That is, find all the concepts which claim it as an example. From our previous
work, we know how to do this:
⊗6ISAs(X) = Generalizations(ISA(Generalizations(X)))⊗*
Here is a nondeterministic way of performing that procedure:
Start at concept X, ripple upwards in the Genl
(Single-line link, opposite of Spec)
direction as much as
you please, go upwards once in the ISA (Triple-line link, opposite of Exs)
direction,
and then ripple again in the Genl direction.
More graphically, we can show this as follows:
.B0 INDENT 6
            C4    ⊗4This concept claims X as an example⊗*
           /
          /
         /
       ...        ⊗4any number of "Genl" links⊗*
       /
      /
     /
    C3
    α⊗
    α⊗            ⊗4One single "Isa" link⊗*
    C2
    /
   /
  /
...               ⊗4any number of "Genl" links⊗*
 /
/
X                 ⊗4The given concept⊗*
.E
Some of these concepts might coincide.
The relation "generalization-of" is transitive (due to the transitivity
of "subset"), but the relation "is-an-example-of" is not transitive
(since "member" isn't transitive).
From the little graph on the preceding page, we can apply this formula to obtain:
.ONCE PREFACE 0 INDENT 0,30,4
ISAs(Compose-Intersect-and-Intersect) =
(Any-concept, Activity, Operation, Composition, Dom=Range-op, Compose-with-self).
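Since Genl and ISA are just the twins of Spec and Exs, the same closure computation recovers ISAs; a Python sketch over the same little graph (the link tables are the upward twins of the arcs pictured earlier):

```python
def generalizations(c, genl):
    """Reflexive-transitive closure of the Genl links."""
    out, frontier = {c}, [c]
    while frontier:
        for g in genl.get(frontier.pop(), ()):
            if g not in out:
                out.add(g)
                frontier.append(g)
    return out

def isas(x, genl, isa):
    """ISAs(X) = Generalizations(ISA(Generalizations(X)))."""
    return {g2
            for g in generalizations(x, genl)
            for i in isa.get(g, ())
            for g2 in generalizations(i, genl)}

# Twin (upward) links of the Spec and Exs arcs in the earlier diagram.
GENL = {"Activity": ["Any-concept"],
        "Predicate": ["Activity"],
        "Operation": ["Activity"],
        "Composition": ["Operation"],
        "Dom=Range-op": ["Composition"],
        "Compose-with-self": ["Composition"]}

ISA = {"Activity": ["Any-concept"],
       "Equal": ["Predicate"],
       "Composition": ["Operation"],
       "Compose-Intersect&Intersect": ["Dom=Range-op", "Compose-with-self"]}
```

isas("Compose-Intersect&Intersect", GENL, ISA) yields exactly the six concepts listed above; the heuristics tacked onto those six are the relevant ones.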
As AM ripples upwards
from Compose-Intersect-and-Intersect,
AM gathers heuristics along the way. Several of these
have to do with why a Composition of any kind might be interesting;
several have to do with why any operation might be interesting, etc.
The two heuristics mentioned in the first paragraph are thus both located quickly.
Incidentally, both of them are satisfied, and say that this operation is interesting.
A few more general heuristics are encountered, tacked onto Any-concept, which
test some features that would make any concept interesting.
Instead of "judging interest", suppose we wish to "fill in examples" of that
bottom concept,
Compose-Intersect-and-Intersect.
How can AM find the heuristics that will result in the right kind of entries being
created/found/computed?
Again, the answer is to ripple upwards, gathering heuristics from each concept
encountered.
Any heuristic tied to a concept in ISAs(Compose-Intersect-and-Intersect) will be
relevant.
For example, heuristics tacked onto Compose explain how to use examples of its
arguments to find examples of the composition. Heuristic rules tacked onto
Operation explain how
to "run" the concept on randomly chosen domain elements, to derive
examples that way. Any-concept's heuristics say to instantiate the definition
of the concept, i.e., to construct examples symbolically, directly from the
definition of Compose-Intersect&Intersect.
In general, the suggestions of the more specific concepts will be
better than those of the general concepts. This is the old generality
⊗4versus⊗* power tradeoff. So AM begins executing the heuristic
rules, rippling upward, and quits after it's used up its time quantum
(its activation energy) for the current task. If it doesn't make it
all the way up to Anything, that's no great loss, since Anything has
such general information that it's almost never practical to use
it. (e.g., "⊗6To fill in examples of anything, pick random objects and
see if they satisfy the definition⊗*"). In production-system
terminology, one may view ⊗4all⊗* the heuristics accessible by
rippling as "triggering", and the specific-first policy is simply
AM's answer to conflict resolution (how to decide which "triggered"
rule's right-hand side actually gets executed next).
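The specific-first execution under a time quantum might be sketched like this (Python; the heuristic names and costs are hypothetical stand-ins, and the breadth-first walk is only an approximation of the Genl/Isa rippling described above):

```python
from collections import deque

def ripple_order(x, genl, isa):
    """Concepts in the order rippling reaches them: breadth-first along
    the upward links, so specific concepts come before general ones."""
    seen, queue, order = {x}, deque([x]), []
    while queue:
        c = queue.popleft()
        order.append(c)
        for up in list(genl.get(c, ())) + list(isa.get(c, ())):
            if up not in seen:
                seen.add(up)
                queue.append(up)
    return order

def run_task(x, genl, isa, heuristics, quantum):
    """Execute the gathered heuristics most-specific-first, quitting
    once the task's time quantum (activation energy) is used up."""
    spent, fired = 0, []
    for concept in ripple_order(x, genl, isa):
        for name, cost in heuristics.get(concept, ()):
            if spent + cost > quantum:
                return fired   # never reach the very general rules
            spent += cost
            fired.append(name)
    return fired
```

With a small quantum, the rules tacked onto Any-concept simply never get reached, which (as noted above) is no great loss.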
These linkages serve another purpose, besides providing an implicit
framework to guide the gathering of heuristic rules. Suppose one rule
asks for examples of Operations, during its course of execution. How
can AM quickly find all examples of operations? Well, a reasonable
answer is to ripple downward along the Spec direction for some time
(perhaps not at all), then go down the Exs link (only once), and then
ripple along Spec again as far as you please.
So the links are used for two purposes:
.BN
λλ To guide the gathering of heuristics. (In
order to actually execute a task once it's been chosen from the agenda;
in order to judge the worth of a given concept; etc.)
λλ As efficient ways to gather examples, specializations, etc.
of a given concept.
.E
.ASSECP(Independence Assumptions)
<< Should this sec. go in EVALU chapter? (Independence assumpts) >>
A task dealing with concept X has been chosen. AM ripples upward
away from X, gathering relevant heuristics. They are executed in the
order they are encountered, so that the heuristics applying to very
general concepts will be executed last. This is because the more
specific a heuristic is, the more powerful it is ⊗4when it applies⊗*.
If a single concept yields up several relevant heuristics, they are
executed in the order they're stored on that concept.
No processing is done to re-order the set of all relevant heuristics.
There are two reasons for this:
.BN
λλ Empirically it wasn't necessary; the specific-first ordering was
always close to the best possible one. It's pointless to squander
cpu time.
λλ Theoretically, this group of heuristics can be run as a production
system, with no ordering whatsoever. Conflict resolution should not
be a crucial issue.
.E
The first reason holds because there is weakness in generality.
The rules tacked onto the specific concepts are almost always superior to those
tacked onto more general concepts.
The
second reason embeds the assumption that the heuristics do not
interact with each other. That assumption of independence (or
"linearity", if you prefer) is quite dangerous, and is not absolutely
true. Sometimes the left-hand-side of a heuristic rule will trigger
(or not) depending on what other heuristic rules have just been tried
(or not tried). For example, a typical left-hand-side might begin by
saying "⊗6If AM has just spent at least 3 seconds trying to
instantiate X.DEFN,...⊗*".
Or a rule might be worded as "⊗6If ↓_any_↓ examples of X exist, Then a
new one can be gotten by...⊗*".
In this case, the rule might fail just because it was the first rule tried.
To blunt just this kind of order-dependence,
each rule is actually accessed twice. This is a very partial solution.
A detailed analysis of the problem, and an intelligent solution, are left
as open research tasks.
There is another kind of "linearity" assumption which is dangerous and
false, and yet one we can get away with: How can we be sure that a
heuristic won't apply to a few scattered concepts, but not to any one
concept and its examples?
That is, how do we know that each heuristic has a nice domain of applicability?
We are associating each heuristic with a
particular concept. How do we know that we shouldn't have to
associate it in general with some random ⊗4subset⊗* of concepts?
In fact, we do assume that
the
associated concept generates the subset of relevant concepts:
namely, all examples of that concept. This simplistic assumption is ⊗4not⊗*
always true. For
example, sometimes, by analogy, a certain heuristic is used way outside
its normal domain of applicability. Then the domain consists of two
isolated pieces of the graph of concepts, two regions that
can never be reconciled into one. It is untenable to
have to maintain all 2↑[200] subsets of 200 concepts, because of the
extreme magnitude of that number. The only answer for this dilemma is again
empirical: as a first approximation, there are very few heuristics
(and very few instances where they are crucial) which can't be
squeezed into the simple "linear" mold.
If a heuristic is really applicable in a couple isolated places, AM will suffer
a duplicate of it to be maintained. In the LISP program, no such duplicates
were ever required.
Another linearity, not dealing with heuristics, was the assumption
that the top few jobs on the agenda rarely interact, so it rarely
matters which of them goes before the others, so AM never spends
cpu time trying to precisely order them: it always picks the very top
one and leaves it at that.
All these simplifications, these "independence assumptions", or
linearizings, are the inklings of a science -- at least, of a model
-- of mathematical research. They imply what kinds of things can be
ignored, what factors don't couple, etc.